Location data integration and management

ABSTRACT

A system and associated methodology manages localization data. According to one embodiment of the present invention, a primary set of data associated with one or more specific locations is imported and matched to a predefined format. Thereafter, external secondary data associated with each of the localities listed in the primary data set is collected from a plurality of third party location service providers. With the collected secondary data matched to the same predefined format, a comparison is made between the plurality of secondary data sets and the primary data set. Differences between the data sets are identified and the primary data set is modified as necessary. Thereafter, normalized data from the modified primary data set is exported to the third party location service providers to enhance the consistency and reliability of locational data.

RELATED APPLICATION

The present application relates to and claims the benefit of priority to U.S. Provisional Patent Application No. 61/866,970 filed 16 Aug. 2013, which is hereby incorporated by reference in its entirety for all purposes as if fully set forth herein.

BACKGROUND OF THE INVENTION

1. Field of the Invention

Embodiments of the present invention relate, in general, to accessibility and usability of locational data and more particularly to systems and processes to increase the accuracy and reliability of locational data.

2. Relevant Background

One of the most widespread uses of geocoding technology is in store/dealer locators. Businesses use geocoded data to ascertain proximity to potential customers, distance to suppliers and competitors, service areas and delivery routes. And consumers use geocoding to find a restaurant, pet store or the nearest coffee shop. Indeed, more and more consumers search for retail establishments using a mapping application rather than a general search engine. In many cases the mapping application can provide key data that the consumer uses to select which establishment to engage. Accuracy of such data, including the location of the establishment, is critical.

One of the challenges for marketers working with location data is that there are many different uses and destinations for such data, and each requires different formats and required fields. Search and social sites including Google®, Facebook®, Yahoo®, Bing®, Foursquare®, and Yelp® all have specific format and data requirements. Other less known but equally important business applications also have unique format and field requirements. And each may represent the location and content data of a single establishment differently.

Marketing platforms use location data for targeting and directing consumers to local outlets through paid search, display, email, and marketing automation tools. But what if the data is inaccurate or missing altogether? For example, 123 N. Main St. may be completely different than 123 S. Main St. Consider a consumer traveling in an unfamiliar city and seeking to locate a retail establishment for lunch. Through a search the individual has located a suitable location and is using a navigation app to arrive at the destination. But upon arrival, according to the application, the desired location is nowhere in sight. Yet a suitable alternative is close at hand and a sale is lost. Accurate and consistent geocoding is a growing concern in commercial enterprises.

And while retail stores would rather have consumers show up at their front door than the loading dock, the loading dock may be a more accurate representation of the retail establishment's address. This sort of error is compounded by the fact that third parties may each associate a technically correct address with different latitude and longitude coordinates. As a result, the representation of the same address among various third party mapping applications can vary, resulting in a wide disparity in the rendering of a point of interest. Such inconsistencies can have a dramatic impact on sales, consumer recognition and business efficiencies. In the same manner, inaccuracy with respect to secondary fields of data can also adversely impact the success of a retail establishment. Having an inaccurate telephone number or hours of operation can deter customers from interacting with a retail location.

What is needed is a system and associated methodology to collect correct locational information, validate and cleanse the data, and compare it against third party sources so as to produce a highly reliable and accurate body of information that can be conveyed consistently.

The system and methodology of the present invention address these and other needs of the prior art for collecting, validating, modifying and exporting improved locational data.

SUMMARY OF THE INVENTION

A system and associated methodology manages localization data. According to one embodiment of the present invention, a primary set of data associated with one or more specific locations is imported and matched to a predefined format. Thereafter, external secondary data associated with each of the localities listed in the primary data set is collected from a plurality of third party location service providers. This includes the identification and collection of potential duplicate sets of secondary data, that is, third party representations of separate locations that are in fact the same location. With the collected secondary data mapped to the same predefined format as the primary set of data, a comparison is made between the plurality of secondary data sets and the primary data set. Differences between the data sets are identified and the primary data set is modified as necessary. Thereafter, normalized data from the modified primary data set is exported to the third party location service providers to enhance the consistency and reliability of locational data.

According to one embodiment of the present invention, a comparison metric is generated that identifies differences between the primary and secondary data sets. Responsive to the comparison metric reaching a predefined threshold, one or more aspects of the primary data set are modified automatically. Moreover, the metrics regarding comparison of the primary data to that of the secondary data sets, and the matching between the sets, are historically tracked, building a foundation of data on which to base modification decisions.

Additional features of the invention can include a method of managing localization data, wherein the modified primary set of data includes a modified set of geospatial coordinates based on differences between the primary and plurality of secondary sets of geospatial coordinates. The modified sets of geospatial coordinates are, in another embodiment of the present invention, based on weighted combinations of differences between the primary and the plurality of secondary sets of geospatial coordinates. The data is then normalized according to a predefined format prior to being exported to a designated third party.

According to another embodiment of the invention, a system for management of localization data includes a comparison engine operable to compare the primary and secondary set of geospatial coordinate data to form a comparison metric. The comparison engine is further operable to compare each of the plurality of secondary sets of geospatial coordinates with the primary set of geospatial coordinates to form a comparison metric. A modification engine then operates to modify the primary set of data and create a modified primary set of data based on the secondary set of data; the modifications being based on a weighted combination of differences between the primary set of geospatial coordinates and each of the plurality of secondary sets of geospatial coordinates.

Other features include a normalization engine operable to convert the modified primary set of data to a predefined format consistent with one or more third parties. An export engine is then used to export the modified set of primary data.

The features and advantages described in this disclosure and in the following detailed description are not all-inclusive. Many additional features and advantages will be apparent to one of ordinary skill in the relevant art in view of the drawings, specification, and claims hereof. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes and may not have been selected to delineate or circumscribe the inventive subject matter; reference to the claims is necessary to determine such inventive subject matter.

BRIEF DESCRIPTION OF THE DRAWINGS

The aforementioned and other features and objects of the present invention and the manner of attaining them will become more apparent, and the invention itself will be best understood, by reference to the following description of one or more embodiments taken in conjunction with the accompanying drawings, wherein:

FIG. 1 presents a high-level block diagram of a system for localization data management according to one embodiment of the present invention;

FIG. 2 is a high-level flowchart of a methodology, according to one embodiment of the present invention, to manage localization data;

FIG. 3 is a flowchart of another method embodiment for management of localization data according to the present invention;

FIG. 4 is a flowchart for identifying differences between a primary set of location data and a plurality of secondary sets of location data derived from third party location services according to one embodiment of the present invention;

FIG. 5 is a rendering of a dashboard, according to one embodiment of the present invention, for importing a primary data set into a localization data management system;

FIG. 6 is a rendering of a dashboard for the validation and matching of data fields of a primary data set against a standardized template according to one embodiment of the present invention;

FIG. 7 is a rendering of geocoding of the primary locational data set according to one embodiment of the present invention;

FIG. 8 is a rendering of one embodiment of a dashboard for the comparison and management of localization data representing data associated with a primary data set of a locale as compared to a plurality of secondary data sets of the same locale from one or more third party location services;

FIG. 9 is a detailed view of a set of primary data set localities having fair pin placement assessments and an associated geospatial representation according to one embodiment of the present invention;

FIG. 10 is a geospatial rendition of a plurality of geospatial pins associated with a common location according to one embodiment of the present invention;

FIG. 11 is a street view and corresponding geospatial rendering of the locality of FIG. 10 with corrected geocodes from a primary data set, according to one embodiment of the present invention;

FIG. 12 depicts a comparison of an inaccurate and/or incomplete primary set of data with that of a plurality of secondary sets of data collected from one or more third party location service providers according to one embodiment of the present invention;

FIG. 13 shows a corrected primary set of data consistent with secondary sets of data collected from a plurality of third party location service providers according to one embodiment of the present invention; and

FIG. 14 shows an updated dashboard indicating a revised comparison metric based on updated geocoding and external data according to one embodiment of the present invention.

The Figures depict embodiments of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.

DESCRIPTION OF THE INVENTION

A system and associated methodology for management of localization data compares a primary set of data to collected third party information related to the same locale. Based on user inputs and evaluation of a comparison metric to a predefined threshold, locational data can be corrected, normalized and exported to a plurality of third parties consistently, reliably and efficiently.

A system, according to one embodiment of the present invention, establishes a primary data source from a customer or a client. The primary data is imported and mapped to a standard set of fields that are representative of data normally associated with localization data. If necessary, data is geocoded according to a standard format and gaps in the information generally associated with locational data are identified. Third party data associated with each locality is thereafter collected and used to score the validity and accuracy of the primary data source. A score, hereafter referred to as a comparison metric, is determined. While the comparison metric is not necessarily an indication of erroneous data fields, it is an indication of disparities between that which a client holds to be representative locational data and that of one or more third parties that present locational data to the public.

Locational data associated with the primary data source that is significantly different from a collected body of third party information can, according to one embodiment of the present invention, be changed automatically based on a comparative analysis of third party data. The user can also validate the accuracy of the primary data manually in light of third party data using a workbench or dashboard. Thereafter the validated and, if necessary, modified data is normalized and presented to various third party applications in a format consistent with those third parties.

Embodiments of the present invention are hereafter described in detail with reference to the accompanying Figures. Although the invention has been described and illustrated with a certain degree of particularity, it is understood that the present disclosure has been made only by way of example and that those skilled in the art can resort to numerous changes in the combination and arrangement of parts without departing from the spirit and scope of the invention.

The following description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of exemplary embodiments of the present invention as defined by the claims and their equivalents. It includes various specific details to assist in that understanding but these are to be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted for clarity and conciseness.

The terms and words used in the following description and claims are not limited to the bibliographical meanings, but are merely used by the inventor to enable a clear and consistent understanding of the invention. Accordingly, it should be apparent to those skilled in the art that the following description of exemplary embodiments of the present invention is provided for illustration purposes only and not for the purpose of limiting the invention as defined by the appended claims and their equivalents.

By the term “substantially” it is meant that the recited characteristic, parameter, or value need not be achieved exactly, but that deviations or variations, including for example, tolerances, measurement error, measurement accuracy limitations and other factors known to those of skill in the art, may occur in amounts that do not preclude the effect the characteristic was intended to provide.

Like numbers refer to like elements throughout. In the figures, the sizes of certain lines, layers, components, elements or features may be exaggerated for clarity.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. Thus, for example, reference to “a component surface” includes reference to one or more of such surfaces.

As used herein any reference to “one embodiment” or “an embodiment” means that a particular element, feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.

As used herein, the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having” or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Further, unless expressly stated to the contrary, “or” refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and A and B are both true (or present).

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the specification and relevant art and should not be interpreted in an idealized or overly formal sense unless expressly so defined herein. Well-known functions or constructions may not be described in detail for brevity and/or clarity.

It will be also understood that when an element is referred to as being “on,” “attached” to, “connected” to, “coupled” with, “contacting”, “mounted” etc., another element, it can be directly on, attached to, connected to, coupled with or contacting the other element or intervening elements may also be present. In contrast, when an element is referred to as being, for example, “directly on,” “directly attached” to, “directly connected” to, “directly coupled” with or “directly contacting” another element, there are no intervening elements present. It will also be appreciated by those of skill in the art that references to a structure or feature that is disposed “adjacent” another feature may have portions that overlap or underlie the adjacent feature.

Included in the description are flowcharts depicting examples of the methodology that may be used to manage localization data. In the following description, it will be understood that each block of the flowchart illustrations, and combinations of blocks in the flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be loaded onto a computer or other programmable apparatus to produce a machine such that the instructions that execute on the computer or other programmable apparatus create means for implementing the functions specified in the flowchart block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable apparatus to function in a particular manner such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the function specified in the flowchart block or blocks. The computer program instructions may also be loaded onto a computer or other programmable apparatus to cause a series of operational steps to be performed in the computer or on the other programmable apparatus to produce a computer implemented process such that the instructions that execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart block or blocks.

Accordingly, blocks of the flowchart illustrations support combinations of means for performing the specified functions and combinations of steps for performing the specified functions. It will also be understood that each block of the flowchart illustrations, and combinations of blocks in the flowchart illustrations, can be implemented by special purpose hardware-based computer systems that perform the specified functions or steps, or combinations of special purpose hardware and computer instructions.

Some portions of this specification are presented in terms of algorithms or symbolic representations of operations on data stored as bits or binary digital signals within a machine memory (e.g., a computer memory). These algorithms or symbolic representations are examples of techniques used by those of ordinary skill in the data processing arts to convey the substance of their work to others skilled in the art. As used herein, an “algorithm” is a self-consistent sequence of operations or similar processing leading to a desired result. In this context, algorithms and operations involve the manipulation of information elements. Typically, but not necessarily, such elements may take the form of electrical, magnetic, or optical signals capable of being stored, accessed, transferred, combined, compared, or otherwise manipulated by a machine. It is convenient at times, principally for reasons of common usage, to refer to such signals using words such as “data,” “content,” “bits,” “values,” “elements,” “symbols,” “characters,” “terms,” “numbers,” “numerals,” “words”, or the like. These specific words, however, are merely convenient labels and are to be associated with appropriate information elements.

Unless specifically stated otherwise, discussions herein using words such as “processing,” “computing,” “calculating,” “determining,” “presenting,” “displaying,” or the like may refer to actions or processes of a machine (e.g., a computer) that manipulates or transforms data represented as physical (e.g., electronic, magnetic, or optical) quantities within one or more memories (e.g., volatile memory, non-volatile memory, or a combination thereof), registers, or other machine components that receive, store, transmit, or display information.

FIG. 1 presents a high-level block diagram of a system for localization data management according to one embodiment of the present invention. The system 100 includes an import engine 110, a collection engine 120, a management engine 160 and an export engine 170. The management engine 160 is further comprised of modules operable to compare and match localization data 130 as well as modify 140 such data when necessary. Lastly, the management engine 160 includes a normalization engine 150 to place localization data into a third party specific format prior to exportation.

The import engine 110 includes, in one embodiment, a portal by which a primary data set is supplied by a client. Data can be introduced into the system in a variety of formats; raw data fields and spreadsheets, including a CSV file having locational data, can also be imported. The import engine 110 accepts the information in client format and maps the data to a predefined format of industry-accepted fields. These core sets of fields serve as the basis for each location and include, among other things, name, address, location, phone number, operating hours, etc. The import engine 110 maps the supplied data to each of these fields, regardless of how they are named or arranged, based on common characteristics. For example, an address field typically includes a numerical value, a name of a street, avenue or boulevard, as well as city, state, country and postal code. While each client's format and labels may differ, the import engine 110 parses the data so as to place it in the correct field. And in some instances data presented by a client is transformed into manageable and understandable packets so that it can be properly mapped into the standard fields.
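The mapping step described above can be illustrated with a minimal sketch. It assumes a simple alias table keyed on header names; the specification describes mapping based on common characteristics of the data, so this is a simplification. The names FIELD_ALIASES, canonical_field and import_csv, and the sample headers, are hypothetical and do not come from the specification.

```python
import csv
import io

# Hypothetical alias table: standard field -> header spellings a client might use.
FIELD_ALIASES = {
    "name": {"name", "store name", "location name", "business"},
    "address": {"address", "street", "street address", "addr"},
    "phone": {"phone", "phone number", "telephone", "tel"},
    "hours": {"hours", "operating hours", "hours of operation"},
}


def canonical_field(header):
    """Return the standard field for a client header, or None if unrecognized."""
    key = header.strip().lower().replace("_", " ")
    for field, aliases in FIELD_ALIASES.items():
        if key in aliases:
            return field
    return None


def import_csv(text):
    """Parse client CSV text and map each row onto the standard fields."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        record = {field: "" for field in FIELD_ALIASES}
        for header, value in raw.items():
            if header is None:
                continue  # surplus cells with no header are ignored
            field = canonical_field(header)
            if field is not None:
                record[field] = (value or "").strip()
        rows.append(record)
    return rows


sample = "Store Name,Street_Address,Tel\nMain St Cafe,123 N. Main St.,555-0100\n"
records = import_csv(sample)
```

Fields the client did not supply, such as hours in this sample, are left empty so that later steps can report them as gaps.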

The present invention also accepts and stores other information that, while not considered necessary with respect to localization data, is nonetheless descriptive of the client's retail or business establishment. Information of this type can include URLs, service descriptions, menus, directions, etc.

One aspect of the present invention is to provide feedback to the client as to the depth and health of the client's locational data. Accordingly, the primary data set is analyzed against a set of metrics to determine whether the data provides basic information such as name, brand, address, phone, etc., as well as its completeness. For example, the data supplied may be missing a basic field such as hours of operation. At the same time, while the data set may include a field for a phone number for each location, many of the localities may lack this information, indicating that the primary data set is incomplete. According to one embodiment of the present invention, the import engine 110 determines and conveys to the client a metric representing a degree of basic information that has been provided as well as a degree of completeness of that information. This internal data metric provides the client with feedback as to the robustness and completeness of their primary data set, exclusive of its comparison to any third parties. Entries that are either incomplete or lacking in basic information are also flagged so that the client can supply additional information to aid in the effectiveness of the system.
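One possible reading of the internal data metric is sketched below. The specification does not give a formula, so the two ratios used here (share of basic fields present as columns, and share of populated cells among those columns) are assumptions, as are the names BASIC_FIELDS and internal_metric.

```python
BASIC_FIELDS = ("name", "address", "phone", "hours")  # assumed set of basic fields


def is_filled(record, field):
    """True when the record carries a non-blank value for the field."""
    return bool(str(record.get(field) or "").strip())


def internal_metric(records):
    """Return (coverage, completeness, flagged) for a list of record dicts.

    coverage:     share of basic fields that appear as a column at all
    completeness: share of populated cells among the columns that appear
    flagged:      (row index, missing basic fields) for each incomplete entry
    """
    if not records:
        return 0.0, 0.0, []
    present = [f for f in BASIC_FIELDS if any(f in r for r in records)]
    coverage = len(present) / len(BASIC_FIELDS)
    filled = 0
    flagged = []
    for index, record in enumerate(records):
        missing = [f for f in BASIC_FIELDS if not is_filled(record, f)]
        if missing:
            flagged.append((index, missing))
        filled += sum(1 for f in present if is_filled(record, f))
    total = len(present) * len(records)
    completeness = filled / total if total else 0.0
    return coverage, completeness, flagged


primary = [
    {"name": "A", "address": "1 Main St", "phone": "555-0100"},
    {"name": "B", "address": "2 Main St", "phone": ""},
]
coverage, completeness, flagged = internal_metric(primary)
```

In this sample the hours field is absent altogether, which lowers coverage, while the blank phone number in the second entry lowers completeness; both entries are flagged for follow-up.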

Client imported data, hereafter referred to as a primary data set, is retained in a database and modified as necessary. In addition to storing the data supplied by the client, it may be necessary to geocode one or more locations. While the import engine 110 and system 100 are capable of accepting client presented geocodes, the import engine 110 is also operable to geocode supplied addresses into exact latitude and longitude coordinates (or similar geospatial codes) that are needed for mapping and positioning.

Having gained a primary set of data from the client that is mapped to a predefined or standard set of fields, the system for management of localization data 100 further collects third party localization data via the collection engine 120. Each entry of a client's primary set of data includes a specific locality. That locality may represent a retail establishment or a similar commercial location. According to one embodiment of the present invention, third party locational data of the same locality is collected and compared to the primary set of data. Each of this plurality of third party secondary data sets presents a unique representation of the same locality. And in some instances duplicates are identified, that is, representations that appear to be different localities yet are in fact the same location. Despite a client's position as being able to provide the most accurate and reliable source of data regarding one of its localities, third party vendors that provide such locality information develop their data regarding such a locality independently. Each possesses different algorithms, protocols and policies by which it collects, verifies and publishes localization data. As a result, significant differences in the data between such third parties can exist and, in some instances, result in the creation of duplicate sets of data. One objective of the present invention is to identify and correct disparities between these localization representations.
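Duplicate identification of the kind described above might be sketched as follows. The specification does not state how duplicates are detected; this sketch assumes that two listings refer to the same location when their phone number and address agree after normalization. The functions normalize_key and find_duplicates, and the listing identifiers, are hypothetical.

```python
import re
from collections import defaultdict


def normalize_key(listing):
    """Reduce a listing to a comparable (phone digits, simplified address) key."""
    phone = re.sub(r"\D", "", listing.get("phone", ""))
    address = re.sub(r"[^a-z0-9 ]", "", listing.get("address", "").lower())
    return phone, " ".join(address.split())


def find_duplicates(listings):
    """Group listing ids whose normalized keys coincide; return groups of 2 or more."""
    groups = defaultdict(list)
    for listing in listings:
        groups[normalize_key(listing)].append(listing["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


listings = [
    {"id": "a", "phone": "(555) 010-0100", "address": "123 N. Main St."},
    {"id": "b", "phone": "555-010-0100", "address": "123 n main st"},
    {"id": "c", "phone": "555-010-0199", "address": "500 Oak Ave"},
]
duplicates = find_duplicates(listings)
```

Exact matching on a normalized key is deliberately conservative; a production system would likely also need fuzzy matching for abbreviations such as "Street" versus "St."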

Using a locality of the primary data set as a basis for the search, the collection engine 120 will initiate an inquiry to third party vendors such as Google®, Bing®, Yahoo®, Foursquare® and the like, to gain secondary localization data consistent with the fields mapped by the primary data set. For example, address information, geocodes, hours of operation, branding data and the like for each locality are individually collected and stored.

As will be appreciated by one skilled in the art, not all data sought by the collection engine is readily available from third party sites. While a third party site may provide a graphical representation of a location on a map, it may not provide exact geocodes. Thus the collection engine 120 uses a plurality of different inquiries to identify data necessary to conduct a robust and effective comparison of localization data.

FIG. 2 presents a basic flowchart, according to one embodiment of the present invention, of a methodology to manage localization data. As shown, the process begins 205 with the importation of a primary data set. That data is then compared to collected, secondary, third party data and, if warranted, modified 230. The role of modification of the primary data set falls to the management engine 160 shown in FIG. 1.

As previously discussed, the management engine 160 includes three modules or engines. They are the comparison and matching engine 130, the modification engine 140, and the normalization engine 150. Each of these components of the management engine 160 works to create an accurate modified version of the primary data set that can be exported.

One of reasonable skill in the relevant art will appreciate that the depictions of the various engines and modules of FIG. 1 or other drawings herein are arbitrary and do not in any way indicate separable or incongruent characteristics of the present invention. The names and graphical depictions are used as a means to describe and demonstrate the functionality of the present invention and have no direct correlation as to the structure of any underlying software, hardware or firmware necessary to implement the invention claimed hereafter.

Turning attention back to FIG. 1, the management engine 160 receives data from the import engine 110 and the collection engine 120 and engages in a comparison to ascertain a degree of accuracy for each locality. This process begins with the comparison and matching engine 130 that conducts a field-by-field comparison of the primary data set to each third party secondary data set. As with the primary data set, each secondary data set from each third party must also be mapped to a standard set of fields. Once mapped, each field for each secondary data set can be compared against the corresponding field of the primary data set.

The results of this comparison yield an external data metric. This external data metric is combined with the internal data metric and a yet to be described geocode comparison to arrive at an overall comparison metric or score. This score provides the client with an indication of the overall health of their locational data and how effectively they are providing locational data regarding their establishments to the public.
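The specification does not state how the three components are combined. The following sketch assumes each component is expressed on a 0 to 1 scale and combined as a weighted mean, with a lower score indicating greater disparity; the weights, the threshold value, and the function names overall_score and below_threshold are all hypothetical.

```python
def overall_score(internal, external, geocode, weights=(0.3, 0.4, 0.3)):
    """Combine three component metrics (each in [0, 1]) into a weighted mean.

    The weights are illustrative assumptions, not values from the specification.
    """
    parts = (internal, external, geocode)
    if any(not 0.0 <= p <= 1.0 for p in parts):
        raise ValueError("each metric must lie in [0, 1]")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(parts, weights)) / total


def below_threshold(score, threshold=0.8):
    """Assumed rule: a score under the threshold marks the locality for action."""
    return score < threshold


score = overall_score(internal=1.0, external=0.5, geocode=0.0)
```

Keeping the components separate until the final step lets a dashboard show the client which of the three areas is pulling the overall score down.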

As mentioned, the comparison engine 130 analyzes, for a locality, each field of a secondary data set against the same field of the primary data. When a difference is identified, that field and third party source are flagged as being different from the primary source. For example, the phone number listed by the third party for a certain locality may differ from what is provided by the primary data source. Yet another third party source of information may have the correct telephone number but incorrect hours of operation. Each of these third party sources would be flagged and the aberrant fields highlighted.
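The field-by-field comparison and flagging can be sketched as below. This is a minimal illustration that assumes both data sets are already mapped to the same standard fields and that values are compared after trimming whitespace and ignoring case. The names compare_fields, provider_a and provider_b are hypothetical, and the match ratio is only one possible form of an external data metric.

```python
def normalize(value):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(str(value or "").lower().split())


def compare_fields(primary, secondary_sets, fields):
    """Compare each secondary data set to the primary, field by field.

    Returns (flags, ratio): flags maps each differing source to the list of
    fields that differ, and ratio is the share of compared fields that match.
    """
    flags = {}
    compared = matched = 0
    for source, secondary in secondary_sets.items():
        differing = []
        for field in fields:
            compared += 1
            if normalize(primary.get(field)) == normalize(secondary.get(field)):
                matched += 1
            else:
                differing.append(field)
        if differing:
            flags[source] = differing
    ratio = matched / compared if compared else 1.0
    return flags, ratio


primary = {"phone": "555-0100", "hours": "9-5"}
secondary_sets = {
    "provider_a": {"phone": "555-0199", "hours": "9-5"},   # wrong phone
    "provider_b": {"phone": "555-0100", "hours": "10-6"},  # wrong hours
}
flags, ratio = compare_fields(primary, secondary_sets, ["phone", "hours"])
```

The sample mirrors the example in the text: one source has a different phone number and the other has different hours, so each is flagged for exactly one field.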

Another feature of the present invention is the comparison of pin placement for each locality. Pin placement refers to the geocode (latitude and longitude) associated with each locality. It is possible and often likely that the geocode of the primary data source differs from one or more of the geocodes of the secondary data sets. This is true even when each secondary data set includes the same address. According to one embodiment of the present invention, the comparison and matching engine 130 identifies discrepancies with pin placements between the primary data set and one or more of the secondary data sets provided by third parties. As with external data, differences in pin placement are flagged so that the client can review and, if necessary, correct the pin placement of the primary data source.

Comparison of external data is, as described, distinct. A mistake found in a phone number or an address is flagged as being inaccurate and presented for correction. Pin placement, however, includes a subjective element. For example, it is highly likely that none of the third party geocodes for a particular locality will exactly match the geocode in the primary data set. Yet a certain degree of difference is likely acceptable. For example, if a geocode for a secondary data set is within 10 feet of that of the primary data set, and both are near the actual location, it is likely within a range of accuracy acceptable to the client. By contrast, a secondary geocode that presents a pin ½ mile from the true location of the establishment presents a different scenario.
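
One way to quantify the degree of difference between two pins is the great-circle distance between their geocodes. The sketch below uses the standard haversine formula; the disclosure does not name a particular distance measure, so this choice, the function names, the tolerance of 10 feet expressed in metres, and the sample coordinates are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres
TEN_FEET_M = 3.048            # 10 feet expressed in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_tolerance(primary, secondary, tolerance_m: float) -> bool:
    """True when the secondary pin lies within tolerance_m of the primary pin."""
    return haversine_m(primary[0], primary[1], secondary[0], secondary[1]) <= tolerance_m


# Illustrative coordinates only.
primary_pin = (40.0, -83.0)
near_pin = (40.00002, -83.0)   # roughly 2 metres north of the primary pin
far_pin = (40.00724, -83.0)    # roughly half a mile north of the primary pin

near_ok = within_tolerance(primary_pin, near_pin, TEN_FEET_M)
far_ok = within_tolerance(primary_pin, far_pin, TEN_FEET_M)
```

The tolerance is a parameter because, as noted above, the acceptable degree of difference is a judgment left to the client.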

A similar challenge is presented when a plurality of geocodes presented by third parties appear to be consistently different than the geocodes of the primary data set. According to one embodiment of the present invention the comparison and matching engine 130, in conjunction with the modification engine 140, determines, based in one embodiment on a weighted average of differences among the geocodes of the primary data set and the secondary data sets, an accepted geocode. This accepted geocode could differ from any of the presented geocodes, including the primary geocode. And according to one embodiment of the present invention the modification engine can not only present revised data to the applicable third parties but also revise the primary data set.

The modification engine 140 and the comparison and matching engine 130 can also identify outliers that are not used in such a determination. Moreover, each third party representation of a particular value may not be given equal weight. One of reasonable skill in the relevant art will recognize that a primary data set may possess thousands of localities. And while it is possible that for each locality a user may review how the primary data set differs from each of the secondary data sets collected from third parties, such an endeavor is often impractical. The present invention captures third party data and analyzes that data along with the primary data provided by the client to determine the most likely and most accurate representation of localization data. And while a weighted average of the collected data can be used to determine a new or modified set of primary data, other techniques known to one of reasonable skill in the relevant art can be used and are contemplated by the present invention.
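
A simplified sketch of determining an accepted geocode follows. It computes a weighted average of the pin coordinates themselves after discarding outliers, which is one plausible reading of the weighted-average approach and not necessarily the disclosed computation. The median-based outlier test, the cutoff distance, the weights and the coordinates are all assumptions chosen for illustration.

```python
import math
from statistics import median
from typing import List, Tuple

Pin = Tuple[float, float, float]  # (latitude, longitude, weight)


def approx_distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Equirectangular approximation, adequate over the short distances involved."""
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return 6_371_000.0 * math.hypot(x, y)


def accepted_geocode(pins: List[Pin], outlier_cutoff_m: float) -> Tuple[float, float]:
    """Weighted average of the pins that lie within the cutoff of the median point."""
    med_lat = median(p[0] for p in pins)
    med_lon = median(p[1] for p in pins)
    # Pins farther than the cutoff from the median point are treated as outliers.
    kept = [p for p in pins
            if approx_distance_m(p[0], p[1], med_lat, med_lon) <= outlier_cutoff_m]
    total_weight = sum(w for _, _, w in kept)
    if total_weight <= 0:
        raise ValueError("no usable pins after outlier removal")
    lat = sum(la * w for la, _, w in kept) / total_weight
    lon = sum(lo * w for _, lo, w in kept) / total_weight
    return lat, lon


# Illustrative pins only: the primary pin, two agreeing trusted sources,
# and one source roughly a kilometre away.
pins = [
    (39.7700, -105.1400, 1.0),
    (39.7702, -105.1402, 2.0),
    (39.7702, -105.1402, 2.0),
    (39.7800, -105.1400, 1.0),
]
lat, lon = accepted_geocode(pins, outlier_cutoff_m=100.0)
```

In this sketch the distant pin is excluded and the accepted geocode falls between the primary pin and the two agreeing sources, closer to the latter because they carry more weight. It therefore differs from every presented geocode, including the primary one.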

In yet another embodiment of the present invention the primary data set can include verified geocodes. While most geocodes are determined based on a location's address, one embodiment of the present invention enables a client to accept a verified set of coordinates from a trusted source. For example, a client may instruct an employee at the site in question to transmit exact geocodes from the establishment. The data can then be entered as being a verified set of geocodes, inhibiting any further modification regardless of the comparison metric.

Once modified (or verified), the primary set of data must be exported to the third parties. However, each third party possesses a specific format for such localization information. The normalization engine 150 accepts data from the comparison and matching engine 130 as well as the modification engine 140 to create a normalized 250 set of data. That data is thereafter exported 270 via the export engine 170 to one or more third parties for consideration and implementation.
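
As a rough illustration of normalization, the sketch below renames standard fields into the field names expected by each recipient. The provider labels and their field names are entirely hypothetical; the actual formats required by any given third party are not described here and would need to be taken from that party's own specification.

```python
from typing import Dict

# Hypothetical per-provider field names, keyed by the standard field name.
EXPORT_FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "provider_x": {"name": "title", "phone": "telephone", "postal_code": "zip"},
    "provider_y": {"name": "business_name", "phone": "phone_number", "postal_code": "postcode"},
}


def normalize_for(provider: str, record: Dict[str, str]) -> Dict[str, str]:
    """Rename standard fields to the provider's names, dropping fields it does not accept."""
    field_map = EXPORT_FIELD_MAPS[provider]
    return {field_map[field]: value
            for field, value in record.items() if field in field_map}


# Illustrative record in the standard format.
standard_record = {
    "name": "Mexican Grill",
    "phone": "480-783-0200",
    "postal_code": "00000",   # placeholder value
    "description": "Casual dining",
}
for_x = normalize_for("provider_x", standard_record)
for_y = normalize_for("provider_y", standard_record)
```

Keeping one mapping per recipient means the primary data set is maintained once, in the standard format, and only the final export step varies by third party.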

Turning attention back to FIG. 2, the management of localization data begins 205 with the importation and collection 210 of primary and secondary data sets respectively. A comparison takes place, metrics are determined and in some cases the primary set is modified 230 to reflect a more accurate statement of the locality. The now modified primary data set is then normalized 250 so that it can be exported 270 to one or more third parties, each in a unique format, ending the process 295.

FIGS. 3 and 4 are flowcharts presenting a high-level outline of the process by which localization data is managed according to the present invention. The process outlined in FIG. 3 begins 305 with the mapping 310 of imported data to a standard format as previously described. The primary data set is thereafter validated 320 against industry norms for locational data to determine whether the primary data set lacks certain fields and whether certain entries are simply absent. In situations in which geocode data is not provided, the addresses provided in the primary data set are geocoded 330 so as to provide a primary set of latitude and longitude coordinates.

For each locality within the primary set of data, third party data is thereafter collected 340. The data is also imported into a standard format so that similar fields of data can be compared. The collected third party data is then compared 350 to the primary data set to identify differences between that which the client possesses and that which is associated with a particular locality by a plurality of third parties. The comparison is scored 360, as are the validation process and the analysis of geocodes, to arrive at a metric by which a client can assess the health of their localization data. Using third party data and user inputs via a dashboard, missing or erroneous data is corrected 370 so that it can thereafter be normalized and exported, ending the process 395.

FIG. 4 shows an iterative process, according to one embodiment of the present invention, by which the primary data set is modified based on its comparison with third party data. As discussed above, the process begins 405 with the identification of differences between the data sets 410. With both the primary data set and a plurality of secondary data sets each mapped to a standard format, the management system identifies differences in the data for each locality. Using these differences a comparison metric 420 is generated. This metric, while indicative of the health of the primary set of data, also can be compared against a predefined threshold to determine whether the primary data set should be modified. These modifications can be with respect to external data or to the geocodes associated with each locality.

If the inquiry 450 as to whether the primary data set should be modified is affirmative, the system, according to one embodiment of the present invention, modifies 460 specific fields of data so as to minimize or eliminate the differences. Another analysis is conducted and the process repeated until the primary data set represents an accurate reflection of the locality data.

With no other modifications needed the process concludes 495 with the modified primary data set being normalized and exported 470 to the third parties as necessary.
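
The iterative process of FIG. 4 can be sketched as a loop that recomputes a metric, compares it against a threshold and modifies one field per pass. In this sketch the metric is the fraction of field values across all secondary records that differ from the primary record, and a modification adopts the most common secondary value; both choices, along with the names difference_metric and reconcile and the sample data, are assumptions rather than the disclosed method.

```python
from collections import Counter
from typing import Dict, List, Tuple

Record = Dict[str, str]


def difference_metric(primary: Record, secondaries: List[Record]) -> float:
    """Fraction of (secondary record, field) pairs that differ from the primary record."""
    pairs = [(s.get(field), value) for s in secondaries for field, value in primary.items()]
    if not pairs:
        return 0.0
    return sum(1 for theirs, ours in pairs if theirs != ours) / len(pairs)


def reconcile(primary: Record, secondaries: List[Record],
              threshold: float, max_rounds: int = 10) -> Tuple[Record, float]:
    """Modify a copy of the primary record until the metric falls below the threshold."""
    current = dict(primary)
    for _ in range(max_rounds):
        if difference_metric(current, secondaries) < threshold:
            break  # inquiry answered in the negative: no further modification
        disagreements = Counter(field for s in secondaries
                                for field, value in current.items()
                                if s.get(field) != value)
        if not disagreements:
            break
        field, _ = disagreements.most_common(1)[0]
        consensus, _ = Counter(s.get(field) for s in secondaries).most_common(1)[0]
        if consensus is None or consensus == current[field]:
            break  # the most-contested field offers no better value; stop
        current[field] = consensus
    return current, difference_metric(current, secondaries)


# Illustrative data only.
primary = {"phone": "480-783-020", "hours": "10-22"}
secondaries = [
    {"phone": "480-783-0200", "hours": "10-22"},
    {"phone": "480-783-0200", "hours": "10-22"},
    {"phone": "480-783-0200", "hours": "11-21"},
]
modified, final_metric = reconcile(primary, secondaries, threshold=0.25)
```

In this example the first pass corrects the phone field, after which the metric drops below the threshold and the loop ends; the single disagreeing hours value remains flagged against its source rather than changing the primary record.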

One skilled in the relevant art will appreciate that determining when to modify the primary set of data based on a plurality of different secondary data sets requires careful analysis. Arguably the provider of the primary set of data is in the best position to provide accurate localization data. However, there may be instances in which, based on a compilation of secondary data with respect to the same locality, the accuracy of the primary data is called into question. The present invention measures such an instance and, when warranted, modifies the primary data set without further input from the user.

FIGS. 5-14 detail an exemplary process by which localization data is imported into the localization management system, third party data is collected and compared to the data, and modifications to the primary data set occur.

FIG. 5 depicts, according to one embodiment of the present invention, the selection and importation of a primary data file. In this example a client, the owner of a restaurant chain with multiple locations, identifies 510 a primary data file that includes basic information regarding each restaurant location. Alternatively the user can drag and drop a file such as a CSV file to an upload portal 520 to achieve the same result. Other means by which to provide a primary set of data are contemplated and should not be viewed as a limitation to the present invention. For example, an automated uploading of data from a client site via an API can provide near real time access to newly added fields and changes to locational data. Similarly, changes to locational data made by the system of the present invention can be downloaded back to the client site to maintain a consistent set of data across the system interface.

As the data is uploaded the import engine 110 identifies the fields of data 530 associated with each locality according to a set of industry norms. In this case each restaurant listed in the primary data set includes fields such as store code, name, address, city, state, postal code, country code, phone number, fax number, home page URL, hours of operation, latitude, longitude, a category code, images associated with the location and a general description.

FIG. 6 represents a matching between the plurality of fields identified in the primary data set and those of a predefined standard 610. In many cases the names of fields used by a client are the same as those used in the management system of the present invention; however, it is important to align the primary data set with a standardized format to ensure that the later comparison can be properly appreciated.

In this case, for example, the fields for name, city and postal code appear to be identical while the fields of street (Address Line 1) and phone number (main phone) differ. Note that not all of the information provided within the primary data set maps to the standard format. If necessary, additional fields can be added to the standard format if such information is deemed pertinent to the comparison.
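
The matching of client fields to the standard format can be sketched as a simple lookup that also reports which client fields have no standard counterpart. The sketch assumes the client's file uses the headers "Street" and "Phone Number" where the standard format uses Address Line 1 and main phone; the direction of that mapping, the identifier spellings and the sample row are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed mapping from client column headers to standard field names.
CLIENT_TO_STANDARD: Dict[str, str] = {
    "Name": "name",
    "City": "city",
    "Postal Code": "postal_code",
    "Street": "address_line_1",
    "Phone Number": "main_phone",
}


def map_to_standard(row: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Return the row keyed by standard field names, plus any client fields left unmapped."""
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for client_field, value in row.items():
        standard_field = CLIENT_TO_STANDARD.get(client_field)
        if standard_field is None:
            unmapped.append(client_field)  # candidate for a new standard field
        else:
            mapped[standard_field] = value
    return mapped, unmapped


# Illustrative row only; the postal code is a placeholder.
client_row = {
    "Name": "Mexican Grill",
    "Street": "154 Hutchinson Ave",
    "City": "Columbus",
    "Postal Code": "00000",
    "Phone Number": "480-783-0200",
    "Regional Manager": "J. Doe",
}
mapped_row, unmapped_fields = map_to_standard(client_row)
```

Reporting the unmapped fields rather than silently discarding them leaves room to extend the standard format when such information is deemed pertinent to the comparison.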

A significant aspect of the importation of data is brand or location name recognition. Many companies place significant value on a precise rendering of their brand as associated with locational data. Accordingly, part of the importation step is to specifically identify 620 a brand or name that is associated with each locality. In this case the name “Mexican Grill” is associated with each locality.

Finally the data is geocoded as shown in FIG. 7. Geocoding identifies 720 the latitude and longitude associated with each locality. According to one embodiment of the present invention the address provided by the client as part of the primary data set is verified against a national database of accepted addresses. Once verified, the address is geocoded to identify a latitude and longitude. In instances in which no geocoding is provided with the primary data set, the newly determined geocodes are added to the data set and used for later comparisons. In the case in which a primary data set includes geocodes, the system can, according to one embodiment of the present invention, independently determine a new set of geocodes based on the provided and verified address. If the geocodes associated with the primary data set are significantly different than those determined by the system, the locality is flagged for further review or, according to one embodiment, modified based on the accepted and verified address.

Once the primary data set is input into the system 100 the collection engine 120 gathers secondary data sets regarding the same location from a plurality of third parties. Parties such as Google®, Foursquare®, Factual®, Facebook®, Yellow Pages®, Bing®, Yelp® and the like are queried for data regarding a specific locality. The collection engine also searches and identifies potential duplicate secondary data sets. Often the processes used by third parties to create and maintain a secondary data set of a single location branch, creating multiple data sets related to the same locality. From the third party's perspective each is a unique location with unique characteristics, yet all refer to the same primary data set. One objective of the present invention is to identify duplicate secondary data sets and merge them into a single, accurate data set consistent with the primary data. With secondary data in hand, the collected data is mapped to the same standard set of fields and compared against the primary data set. The comparison produces, among other things, a dashboard presenting to the user a summary of the comparison analysis.

FIG. 8 presents, according to one embodiment of the present invention, a rendering of a dashboard showing the comparison of a primary set of locational data to a plurality of third party data sets. The dashboard 800 presents a summary panel that includes an overall comparison metric 820 and individual contributing scores. These contributing scores include metrics with respect to data provided 830, the location of the geocodes (pins) 835 and external data 840.

The dashboard also includes more detailed information regarding the completeness and state of the primary data set 850 as well as an overview of the pin placement 860 in comparison to third party data. Lastly the dashboard 800 provides information on how the external data 870 compares to that of the primary data set for each party.

In this case the primary data set includes 20 locations and has an overall score or comparison metric of 76. In this embodiment of the present invention a score of 100 indicates a perfect correlation between the primary set of data and all third party data while 0 indicates a lack of correlation. Other means by which to measure and convey the health of the primary data are indeed possible and contemplated by the present invention.

The dashboard also indicates that the overall score of 76 includes contributing scores of 80 with respect to the completeness and scope of the primary data set, 53 for the correlation of the geocodes associated with the primary data set as compared to those of the third party sources, and 90 for external data. Thus, in this case, one can conclude that the primary data appears to be relatively complete and external data of third party sources appears to mostly match that presented by the primary data set. However, geocoding associated with the third party data sets as compared to that of the primary data set shows significant errors.

Lastly the external data panel 870 depicts inconsistencies between the third party sources. While none of the third party sources exactly matches the primary data source, three seem to possess a very high correlation, two are mediocre and two are outliers.

In each instance the dashboard enables the user to drill down to each locality so as to determine how a particular locality differs from the third party data or from the standard format. For example, in the primary data set panel 850, 6 locations are flagged as not meeting the standard format requirements for either scope or completeness. One can select each of these locations and determine what data is missing or recorded in error. For example, one of the flagged localities of the primary data set may have the phone field blank despite the fact that the other data is complete.

Similarly, the pin placement (geocoding) panel 860 identifies pin placement as confirmed, good, fair or poor. The spread of the pins may also be characterized as being close, acceptable, or scattered. In this case 15% of the localities are associated with poor pin placement, 10% fair, 50% good and 25% confirmed. As shown in FIG. 9 the system enables a user to ascertain more detail as to why a particular locality's pin placement has been assessed as fair, poor or good.

FIG. 9 presents, according to one embodiment of the present invention, a rendering of a pin placement analysis for locational data. In the example shown, three locations out of 20 have been determined to possess fair pin placement. Such an assessment is determined, in one embodiment, by assuming that the pin placement of the primary data source is true and determining a collective degree of difference between each of the third party pins and the primary pin. In another embodiment the center of a cluster of pins can be determined to be the true value and differences measured from that location.

In this case three locations 920 have been designated as having fair pin placement.

The first of the three Mexican Grills is located at 154 Hutchinson Ave, in Columbus, Ohio. A satellite image 930 of the vicinity of the address shows the placement of each pin along with a representative avatar. In this case the primary data pin 940 is located on Hutchinson Ave, as is the pin associated with YP® 945. The Factual® pin 950 is located near the intersection of Hutchinson Ave and High Cross Blvd while the Google® pin 960, Bing® 970 and Facebook® 980 appear near a building. The graphic thus represents not only a fair rendering of the geocoding but one that is scattered among the various third party data files. In this instance the Google® pin 960 is most representative of the actual location of the restaurant. A user therefore can drag the primary pin 940 to coincide with that of Google® 960 and Bing® 970. By doing so the primary data set is modified and the flag removed.

FIG. 10 presents another rendering 1010 of a comparison of locational data according to one embodiment of the present invention. In this case the Mexican Restaurant located on Youngfield St. in Wheat Ridge, Colo. 920 is depicted, as are 7 different pin locations. Again the pin placement has been assessed as fair. Here the primary data pin 1020 (as are YP® 1040 and Foursquare® 1030) is located on the street (Youngfield St.) as opposed to being by the retail establishment 1090. However, the image in FIG. 10 shows a clear cluster of pins near the same location. In this case Google® 1060, Bing® 1080, Factual® 1070, and Facebook® 1050 are near the actual restaurant location 1090. According to one embodiment of the present invention, a comparison of the geocoding data can identify a close correlation of several geocodes at the same locality. Each of the third party sources can additionally possess a rating or score to indicate the confidence with which the system values a particular geocoding. A close correlation among highly valued third party sources can also be scored and trigger an automatic modification of the primary data geocoding so as to match a central location amongst the cluster.

In this example the assessed value of the geocodes of Google® 1060, Factual® 1070, Facebook® 1050 and Bing® 1080, the fact that each of these pins is clustered within a predefined area, and the low value placed on YP® 1040 and Foursquare® 1030, drive the system to automatically relocate the primary data geocoding pin to a central location near the actual position of the restaurant 1090.

FIG. 11 presents two additional views of the interface for repositioning of locational data according to one embodiment of the present invention. FIG. 11 presents a street image of the cluster 1190 of third party pins 1060, 1070, 1050, as well as the repositioned primary pin 1120 near the front door of the retail location. The lower, top-down satellite view 1010 shows the relocated primary data pin 1120 positioned within the cluster of third party pins 1050, 1060, 1070, 1080.

In such a manner, inaccurate geocoding of the primary data source can be automatically modified. Moreover, the user can be notified of such a modification for later validation rather than having to manually review each pin placement analysis that is flagged as poor or fair before a change occurs. While the present example depicts a primary data source having 20 restaurants, other data sources may have several thousand localities. For example, it is estimated that Bank of America manages the location of over 18,000 ATMs worldwide. In instances in which a correlation of trusted third party data indicates that the primary geocode is in error, the information can be efficiently and effectively updated using the present invention.

According to another embodiment of the present invention the same sort of cleansing of data associated with a primary data source can occur with respect to other fields. FIG. 12 presents a rendering of missing data being supplied by trusted third party sources. In this case the Mexican Grill in Chandler, Ariz. has been flagged 1210 as lacking a phone number. Indeed the phone number 1230 of the primary data source has only 9 digits.

Yet each of three trusted third party sources 1240, 1250, 1260 indicates not only a high correlation with the data that is present with the primary data source (480-783-020 vs. 480-783-0200) but each is the same. As with geocodes, the reliability of each third party source can be assessed, as can the correlation between various third party sources. An algorithm can determine, based on the assessed value and correlation, whether a predefined threshold has been reached. If so, the values consistently associated with the secondary data sources can be used to modify the primary data source. Once the primary data set 1310 is updated, as shown in FIG. 13, the flags associated with each secondary data set 1320 can be replaced with a validation mark showing a consistency between the data shown on each secondary data set 1240, 1250, 1260 and the primary data set 1310.
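
A minimal sketch of such a threshold test follows. It sums a reliability score for each distinct value reported by the third party sources and adopts the best-supported value only when its share of the total reliability reaches the threshold. The function name consensus_update, the reliability scores and the threshold of 0.75 are hypothetical; the disclosure does not specify the algorithm or its parameters.

```python
from typing import Dict, Tuple


def consensus_update(primary_value: str,
                     sources: Dict[str, Tuple[str, float]],
                     threshold: float) -> str:
    """Return the value to hold in the primary record after weighing the sources.

    sources maps a source label to (reported value, reliability score).
    """
    support: Dict[str, float] = {}
    for value, reliability in sources.values():
        support[value] = support.get(value, 0.0) + reliability
    total = sum(support.values())
    if total <= 0:
        return primary_value
    best_value = max(support, key=lambda v: support[v])
    # Adopt the consensus only when trusted sources agree strongly enough.
    if best_value != primary_value and support[best_value] / total >= threshold:
        return best_value
    return primary_value


# Illustrative sources and scores only.
agreeing = {
    "source_1240": ("480-783-0200", 0.90),
    "source_1250": ("480-783-0200", 0.80),
    "source_1260": ("480-783-0200", 0.85),
}
updated_phone = consensus_update("480-783-020", agreeing, threshold=0.75)

split = {
    "source_a": ("480-783-0200", 0.5),
    "source_b": ("480-783-0201", 0.5),
}
unchanged_phone = consensus_update("480-783-020", split, threshold=0.75)
```

When all three sources agree, the truncated primary number is replaced; when the sources are evenly split, no value reaches the threshold and the primary value is left for manual review.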

One object of the present invention is to import a primary set of localization data and compare that data to a plurality of secondary data sets derived from unique third party sources. Data from each source, be it the primary data set or the secondary data sets, are mapped to a standard set of fields wherein discrepancies are determined and conveyed to a user via a dashboard. The comparison of the primary data set to the plurality of secondary data sets produces a comparison metric indicative of the health of the primary data set. While a user can use the dashboard to investigate and manually correct any error, in one embodiment, the present invention can autonomously determine whether the primary data set should be modified to reflect data held by one or more of the secondary data sets. Once modified, either manually or automatically, the new primary data set can be normalized and exported to each third party in a format uniquely acceptable to that entity.

Another feature of the present invention is the ability to update a particular field among a widespread number of third party location providers. For example, a business entity that has elected to increase the hours of operation in a geographic region can modify the primary data set with respect to those localities and then export that data to each third party enterprise that provides locational data. In the same way, widespread changes to company-wide information can be efficiently submitted to each provider before the third party can ascertain the change through its independent methods.

Those familiar with the relevant art will understand that the invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. Likewise, the particular naming and division of the modules, managers, functions, systems, engines, layers, features, attributes, methodologies, and other aspects are not mandatory or significant, and the mechanisms that implement the invention or its features may have different names, divisions, and/or formats. Furthermore, as will be apparent to one of ordinary skill in the relevant art, the modules, managers, functions, systems, engines, layers, features, attributes, methodologies, and other aspects of the invention can be implemented as software, hardware, firmware, or any combination of the three. Of course, wherever a component of the present invention is implemented as software, the component can be implemented as a script, as a standalone program, as part of a larger program, as a plurality of separate scripts and/or programs, as a statically or dynamically linked library, as a kernel loadable module, as a device driver, and/or in every and any other way known now or in the future to those of skill in the art of computer programming. Additionally, the present invention is in no way limited to implementation in any specific programming language, or for any specific operating system or environment. Accordingly, the disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.

In a preferred embodiment, the present invention can be implemented in software and is web based. Software programming code that embodies the present invention is typically accessed by a microprocessor from long-term, persistent storage media of some type, such as a flash drive or hard drive. The software programming code may be embodied on any of a variety of known media for use with a data processing system, such as a diskette, hard drive, CD-ROM, or the like. The code may be distributed on such media, or may be distributed from the memory or storage of one computer system over a network of some type to other computer systems for use by such other systems. Alternatively, the programming code may be embodied in the memory of the device and accessed by a microprocessor using an internal bus. The techniques and methods for embodying software programming code in memory, on physical media, and/or distributing software code via networks are well known and will not be further discussed herein.

Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the invention can be practiced with other computer system configurations, including hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

An exemplary implementation of the present invention may also be executed in a Web environment, where software installation packages are downloaded using a protocol such as the Hypertext Transfer Protocol (HTTP) from a Web server to one or more target computers (devices, objects) that are connected through the Internet. Alternatively, an implementation of the present invention may be executed in other non-Web networking environments (using the Internet, a corporate intranet or extranet, or any other network) where software packages are distributed for installation using techniques such as Remote Method Invocation (“RMI”) or Common Object Request Broker Architecture (“CORBA”). Configurations for the environment include a client/server network, as well as a multi-tier environment. Furthermore, it may happen that the client and server of a particular installation both reside in the same physical device, in which case a network connection is not required. (Thus, a potential target system being interrogated may be the local device on which an implementation of the present invention is implemented.)

The present invention is a web-based tool that helps marketers and agencies make their location data accurate, accessible and usable. The present invention embodies a set of tools to improve and prepare data using structured and repeatable workflows. The management system of the present invention maintains a central, authoritative repository of location data that can be accessed and manipulated by a user while making that data available across digital marketing channels such as search, social, maps, and mobile access points.

While there have been described above the principles of the present invention in conjunction with a localization management system and its associated methodology, it is to be clearly understood that the foregoing description is made only by way of example and not as a limitation to the scope of the invention. Particularly, it is recognized that the teachings of the foregoing disclosure will suggest other modifications to those persons skilled in the relevant art. Such modifications may involve other features that are already known per se and which may be used instead of or in addition to features already described herein. Although claims have been formulated in this application to particular combinations of features, it should be understood that the scope of the disclosure herein also includes any novel feature or any novel combination of features disclosed either explicitly or implicitly or any generalization or modification thereof which would be apparent to persons skilled in the relevant art, whether or not such relates to the same invention as presently claimed in any claim and whether or not it mitigates any or all of the same technical problems as confronted by the present invention. The Applicant hereby reserves the right to formulate new claims to such features and/or combinations of such features during the prosecution of the present application or of any further application derived therefrom.

We claim:
 1. A method for management of localization data, the method comprising: importing a primary set of data associated with a location; associating a secondary set of data with the location wherein the secondary set of data includes a plurality of third party data sets; generating a comparison metric by identifying differences between the secondary set of data and the primary set of data; and responsive to the comparison metric reaching a predefined threshold, creating a modified primary set of data based on the secondary set of data.
 2. The method for management of localization data according to claim 1, wherein the primary set of data includes a primary set of geospatial coordinates.
 3. The method for management of localization data according to claim 1, further comprising matching the primary set of data with a predefined format.
 4. The method for management of localization data according to claim 1, wherein the secondary set of data includes a plurality of secondary sets of geospatial coordinates.
 5. The method for management of localization data according to claim 4, wherein generating includes identifying differences between each of the plurality of secondary sets of geospatial coordinates and the primary set of geospatial coordinates.
 6. The method for management of localization data according to claim 5, wherein the modified primary set of data includes a modified set of geospatial coordinates based on difference between the primary set of geospatial coordinates and the plurality of secondary sets of geospatial coordinates.
 7. The method for management of localization data according to claim 6, wherein the modified set of geospatial coordinates is based on a weighted combination of difference between the primary set of geospatial coordinates and each of the plurality of secondary sets of geospatial coordinates.
 8. The method for management of localization data according to claim 1, further comprising normalizing the modified primary set of data to a predefined format set of primary data.
 9. The method for management of localization data according to claim 1, further comprising exporting the predefined format set of primary data.
 10. The method for management of localization data according to claim 1, further comprising collecting the secondary set of data from the plurality of third parties.
 11. The method for management of localization data according to claim 1, wherein the comparison metric is based on a weighted average of differences between the secondary set of data and the primary set of data.
 12. A system for management of localization data, the system comprising: a primary set of data associated with a location; a secondary set of data associated with the location wherein the secondary set of data includes a plurality of third party data sets; a comparison engine operable to compare the secondary sets of data with the primary set of data to form a comparison metric; and a modification engine operable to modify the primary set of data, and wherein responsive to the comparison metric reaching a predefined threshold the modification engine creates a modified primary set of data based on the secondary set of data.
 13. The system for management of localization data according to claim 12, wherein the primary set of data includes a primary set of geospatial coordinates.
 14. The system for management of localization data according to claim 12, wherein the secondary set of data includes a plurality of secondary sets of geospatial coordinates.
 15. The system for management of localization data according to claim 14, wherein the comparison engine is operable to compare each of the plurality of secondary sets of geospatial coordinates with the primary set of geospatial coordinates to form the comparison metric.
 16. The system for management of localization data according to claim 15, wherein the modified primary set of data includes a modified set of geospatial coordinates based on the plurality of secondary sets of geospatial coordinates.
 17. The system for management of localization data according to claim 16, wherein the modified set of geospatial coordinates is based on a weighted combination of difference between the primary set of geospatial coordinates and each of the plurality of secondary sets of geospatial coordinates.
 18. The system for management of localization data according to claim 12, further comprising a normalization engine operable to convert the modified primary set of data to a predefined format set of primary data.
 19. The system for management of localization data according to claim 12, further comprising an export engine operable to export the predefined format set of primary data.
 20. The system for management of localization data according to claim 12, wherein the comparison metric is based on a weighted average of differences between the secondary set of data and the primary set of data.
 21. The system for management of localization data according to claim 12, further comprising a collection engine operable to identify and collect secondary data associated with the location from a plurality of third parties.